Firebolt
Querying Overview
Firebolt uses SQL as the basis for its querying language. It includes extensions for semi-structured data, arrays, geospatial data, and JSON. The primary reference for Firebolt SQL is at https://docs.firebolt.io/sql_reference/commands/queries/select.html
For FireBolt core the elements listed in the generic “Collections” drop down for the Data Source Designer and other query tools do not include any tables or views with a table_schema of 'information_schema'. This removes tables and views which are not queryable, but normally listed. See https://docs.firebolt.io/firebolt-core/firebolt-core-differences for more details. A general purpose query would be
SELECT* FROM information_schema.tables
Querying Considerations
Remember that all Firebolt column names are lower cased in the answer set. This matters when referencing row values by field (column) name.
Handling JSON Data
Qarbine natively queries Firebolt which can return answer sets with complex JSON objects. The JSON data is stored in a TEXT column. The JSON functions provide a way to extract specific values from JSON documents, particularly from those stored in a TEXT column. These functions extract part of the document, but preserve the original data in the result:
- JSON_EXTRACT
- JSON_EXTRACT_ARRAY
- JSON_POINTER_EXTRACT_VALUES
These functions convert a JSON value to a SQL value:
- JSON_VALUE
- JSON_VALUE_ARRAY
- JSON_POINTER_EXTRACT_KEYS
For more details see https://docs.firebolt.io/reference-sql/functions-reference/json.
The following discussion is based on the table definition and populating discussed at
https://docs.firebolt.io/guides/loading-data/working-with-semi-structured-data/load-json-data
The statements to create the table and populate it are
-- Create a staging table for raw JSON data with one JSON object per row
DROP TABLE IF EXISTS doc_visits_source;
CREATE TABLE doc_visits_source (
raw_json TEXT
);
-- Insert raw JSON data as individual rows
INSERT INTO doc_visits_source (raw_json)
VALUES
('{"id": 1, "StartTime": "2020-01-06 17:00:00", "Duration": 450, "tags": ["summer-sale", "sports"], "user_agent": {"agent": "Mozilla/5.0", "platform": "Windows NT 6.1", "resolution": "1024x4069"}}'),
('{"id": 2, "StartTime": "2020-01-05 12:00:00", "Duration": 959, "tags": ["gadgets", "audio"], "user_agent": {"agent": "Safari", "platform": "iOS 14"}}');
Running
SELECT * from doc_visits_source
results in the following answer set
The 2 rows are just single columns with JSON strings.
Firebolt provides several functions to interact with JSON includes ones to extract specific values.Here is SQL to extract the user_agent sub-text.
SELECT *
JSON_POINTER_EXTRACT(raw_json, '/user_agent')::TEXT as user_agent
from doc_visits_source
Sample results are shown below.
Note that these values are just JSON strings and not JSON objects.
Qarbine provides “pragmas” to manipulate answer set contents. These are described in detail within the primary Data Source Designer document. Qarbine can easily convert the JSON strings into JSON objects for use by template processing and other processes. The first one to consider is shown below.
#pragma convertToObject raw_json
select *
from doc_visits_source
limit 25
This results in
The first row is shown below.
The formula in the template to access the Duration would be
#raw_Json.Duration
The data of interest is one level deeper than is likely wanted so another pragma stage can be added as shown below.
#pragma convertToObject raw_json
#pragma pullFieldsUp raw_json
select *
from doc_visits_source
limit 25
This results in
The first row is shown below.
The formula in the template to access the Duration is now simpler
#Duration
Array Handling
Given a table definition of
CREATE FACT TABLE ecommerce (
product_id INTEGER NOT NULL,
product_name TEXT,
tags ARRAY(TEXT) -- ARRAY of strings, e.g., for product tags
)
PRIMARY INDEX product_id;
and the table populating statement of
INSERT INTO ecommerce (product_id, product_name, tags)
VALUES
(1, 'Laptop', ['electronics', 'portable', 'fast'] ),
(2, 'Smartphone', ['electronics', 'mobile', 'camera'] ),
(3, 'Headphones', ['audio', 'portable'] );
The query
SELECT * FROM ecommerce
results in the answer set
The first row looks like the following
Notice the tags are in their natural array format.
Vector Searching
Firebolt provides support for vector search enabling efficient retrieval of relevant data using vector embeddings. Firebolt does not generate embeddings itself but is designed to work with embeddings from any external model. Firebolt includes vector functions for similarity and distance.
Similarity functions:
- VECTOR_COSINE_SIMILARITY(): Measures the cosine similarity between two vectors.
- VECTOR_INNER_PRODUCT(): Computes the inner product between vectors.
Distance functions:
- VECTOR_COSINE_DISTANCE()
- VECTOR_EUCLIDEAN_DISTANCE()
- VECTOR_MANHATTAN_DISTANCE()
- VECTOR_SQUARED_EUCLIDEAN_DISTANCE()
These functions allow you to order results by similarity or distance, depending on your use case.
Firebolt integrates vector search with structured querying, so you can run hybrid queries that combine vector similarity with traditional filtering (e.g., filtering by chunking strategy, user type, or embedding model). A sample query is shown below.
SELECT chunk_content,
vector_cosine_similarity(question_vector, EMBEDDING) AS similarity,
document_name
FROM table_name
WHERE chunking_strategy = 'Semantic chunking'
AND internal_only = FALSE
AND embedding_model = 'nomic_embed_text'
ORDER BY similarity DESC
LIMIT 10;
This query retrieves the top 10 most similar chunks to a user's question, using cosine similarity relative to the supplied EMBEDDING value. It can be supplied in a Qarbine query specification using a simple variable
vector_cosine_similarity(question_vector, @myEmbedding)
or dynamically determined using a macro function call to a Qarbine AI assistance via
vector_cosine_similarity(question_vector,
[! embedding(@userInput, 'myAiAssistance’) !] )
For more details see the Firebolt documentation at
https://www.firebolt.io/faq/how-does-firebolts-vector-search-compare-to-dedicated-vector-databases
Miscellaneous Queries
The following queries are general DBA oriented ones.
SHOW DATABASES
SHOW TABLES
SHOW COLUMNS user_profile
Troubleshooting
You can obtain the low level query being sent from Qarbine to Firebolt by pressing the ALT key and clicking the run icon.
Firebolt’s Develop Space can be used to run SQL queries interactively. See the following page for details https://docs.firebolt.io/guides/run-queries/using-the-develop-workspace